3574 results found.
Written
Corpus,
Language Type:
Monolingual
Languages:
Bulgarian Croatian Czech Danish Dutch English Estonian Finnish French German Greek Hungarian Icelandic Irish Italian Latvian Lithuanian Maltese Polish Portuguese Romanian Slovak Slovenian Spanish Swedish
Availability:
Freely Available
License:
CC-0
Size:
341856530 sentences
Production Status:
Newly created-in progress
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: ParaCrawl: Web-Scale Acquisition of Parallel Corpora
- Paper track: Long/Resources and Evaluation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Philipp Koehn | ParaCrawl | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
LDC User Agreement for Non-Members
Size:
77000000 words
Production Status:
Existing-used
Use:
Summarisation
- Paper title: Improving Truthfulness of Headline Generation
- Paper track: Long/Summarization
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Kazuki Matsumaru | Annotated English Gigaword | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Creative Commons Attribution 4.0 International
Size:
1271 entries
Production Status:
Newly created-finished
Use:
Text Mining
- Paper title: Efficient Pairwise Annotation of Argument Quality
- Paper track: Long/Sentiment Analysis, Stylistic Analysis, and Argum
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Lukas Gienapp | Webis-ArgQuality-20 | /N |
Documentation:
English documentation is available at https://webis.de/data/webis-argument-quality-20.html
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
107,372 entries
Production Status:
Existing-used
Use:
Summarisation
- Paper title: Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization
- Paper track: Short/Summarization
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Sajad Sotudeh | MIMIC-CXR | /N |
Documentation:
None
Written
Treebank,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
Size:
None
Production Status:
Existing-used
Use:
Discourse
- Paper title: Implicit Discourse Relation Classification: We Need to Talk about Evaluation
- Paper track: Short/Discourse and Pragmatics
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Najoung Kim | Penn Discourse Treebank 3.0 | /N |
Documentation:
https://catalog.ldc.upenn.edu/docs/LDC2019T05/PDTB3-Annotation-Manual.pdf
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
CreativeCommons
Size:
113 MByte
Production Status:
Newly created-finished
Use:
Natural Language Generation
- Paper title: Neural CRF Model for Sentence Alignment in Text Simplification
- Paper track: Long/Generation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Chao Jiang | Wiki-Auto | /N |
Documentation:
The documentation is in the Github repository. It is in English and publicly available.
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
CreativeCommons
Size:
158.9 MByte
Production Status:
Newly created-finished
Use:
Natural Language Generation
- Paper title: Neural CRF Model for Sentence Alignment in Text Simplification
- Paper track: Long/Generation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Chao Jiang | Wiki-Manual | /N |
Documentation:
The documentation is in the Github repository. It is in English and publicly available.
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
5 MByte
Production Status:
Existing-used
Use:
Question Answering
- Paper title: Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering
- Paper track: Long/Question Answering
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Vikas Yadav | Question Answering using Sentence Composition (QASC) | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
10 MByte
Production Status:
Existing-used
Use:
Question Answering
- Paper title: Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering
- Paper track: Long/Question Answering
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Vikas Yadav | Multi-sentence Reading Comprehension | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Creative Commons Attribution 4.0 International License
Size:
210,532 tokens
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
- Paper title: Towards Open Domain Event Trigger Identification using Adversarial Domain Adaptation
- Paper track: Short/Information Retrieval and Text Mining
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Aakanksha Naik | LitBank | /N |
Documentation:
https://github.com/dbamman/litbank




